Bailhache et al. BMC Pediatrics 2013, 13:202 
http://www.biomedcentral.com/1471-2431/13/202 






RESEARCH ARTICLE Open Access 



Is early detection of abused children possible?: a 
systematic review of the diagnostic accuracy of 
the identification of abused children 

Marion Bailhache, Valeriane Leroy, Pascal Pillet and Louis-Rachid Salmi*



Abstract 

Background: Early detection of abused children could help decrease mortality and morbidity related to this major 
public health problem. Several authors have proposed tools to screen for child maltreatment. The aim of this 
systematic review was to examine the evidence on accuracy of tools proposed to identify abused children before 
their death and assess if any were adapted to screening. 

Methods: We searched PUBMED, PsycINFO, SCOPUS, FRANCIS and PASCAL for studies estimating diagnostic
accuracy of tools identifying neglect, or physical, psychological or sexual abuse of children, published in English or 
French from 1961 to April 2012. We extracted selected information about study design, patient populations, 
assessment methods, and the accuracy parameters. Study quality was assessed using QUADAS criteria. 

Results: A total of 2 280 articles were identified. Thirteen studies were selected, of which seven dealt with physical 
abuse, four with sexual abuse, one with emotional abuse, and one with any abuse and physical neglect. Study 
quality was low, even when not considering the lack of a gold standard for detection of abused children. In 11
studies, instruments identified abused children only when they had clinical symptoms. Sensitivity of tests varied 
between 0.26 (95% confidence interval [0.17-0.36]) and 0.97 [0.84-1], and specificity between 0.51 [0.39-0.63] and
1 [0.95-1]. The sensitivity was greater than 90% only for three tests: the absence of scalp swelling to identify 
children victims of inflicted head injury; a decision tool to identify physically-abused children among those 
hospitalized in a Pediatric Intensive Care Unit; and a parental interview integrating twelve child symptoms to 
identify sexually-abused children. When the sensitivity was high, the specificity was always smaller than 90%. 

Conclusions: In 2012, there is low-quality evidence on the accuracy of instruments for identifying abused children. 
Identified tools were not adapted to screening because of low sensitivity and late identification of abused children,
when they already have serious consequences of maltreatment. Development of valid screening instruments is a
pre-requisite before considering screening programs. 

Background

The World Health Organization (WHO) defines child 
maltreatment as "all forms of physical and/or emotional 
ill-treatment, sexual abuse, neglect or negligent treat- 
ment or commercial or other exploitation, resulting in 
actual or potential harm to the child's health, survival, 
development or dignity" [1]. It is a major public health 
issue worldwide. Gilbert et al. estimated that every year
in high-income countries about 4 to 16% of children 
were physically abused, one in ten was neglected or psy- 
chologically abused, and between 5 and 10% of girls and 
up to 5% of boys were exposed to penetrative sexual 
abuse during childhood [2]. Child maltreatment can 
cause death of the child or major consequences on men- 
tal and physical health, such as post-traumatic stress dis- 
order and depression, in childhood or adulthood [2]. 
WHO estimated that 155 000 deaths in children younger 
than 15 years occurred worldwide in 2000 as a result of
abuse or neglect [3]. 

In France, a retrospective study carried out in three 
regions from 1996 to 2000 showed that many children 
who died from abuse were not identified as abused be- 
fore their deaths. After excluding clear neonaticides, 25 
of 53 (47%) infants who died from suspicious or violent 
death had signs of prior abuse, such as fractures of 
different ages, discovered during post-mortem investiga- 
tions. Only eight of these children were already known 
to be victims of abuse [4]. Similarly, only 33% of children 
who were born in California between 1999 and 2006 and 
died from intentional injury during the first five years of 
life had been previously reported to Child Protection 
Services [5]. Consequently, children who die from
maltreatment may have been victims of chronic abuse that
was not diagnosed before their death. Systematic
early detection of abused children could help prevent 
these deaths and lessen child maltreatment-related mor- 
bidity. However, as in usual screening programs, it is 
important to balance potential positive and negative 
effects and to determine the conditions for a screening 
program of child maltreatment to be effective. A first 
necessary condition is the availability of a test identifying
correctly abused children before they have serious or irre- 
versible consequences of maltreatment. 

The diagnostic accuracy of ocular signs in abusive head
trauma, and the clinical and neuroradiological features
associated with abusive head trauma, have already been
synthesized [6-9]. In the reviewed studies, however, markers
identified children when they already had serious
consequences of child maltreatment. Sometimes the diagnosis
was made after the child had died. Furthermore,
the diagnostic accuracy of markers was not always esti- 
mated, the analysis being limited to estimating the asso- 
ciation between a marker and maltreatment. Similarly, 
diagnostic accuracy of genital examination for identify- 
ing sexually abused prepubertal girls was reviewed [10], 
but tools only identified children who were victims of a 
severe form of sexual abuse (genital contact with pene- 
tration). Furthermore, the sensitivity for several potential 
markers, such as hymeneal transections, deep notches or 
perforations, was never reported. 

Several authors have already considered screening in 
emergency departments [11-13]. A large study in the 
United Kingdom evaluated the accuracy of potential 
markers: child age, type of injuries, incidence of repeat
attendance, and the accuracy of clinical screening as- 
sessments for detecting physical abuse in injured chil- 
dren attending Accident and Emergency departments 
[13]. They found no relevant comparative studies for in- 
cidence of repeat attendance, only one study which re- 
ported a direct comparison of type of injury in abused 
and non-abused children, and three studies for child
age. However, two of these three studies were limited to
a subset of children admitted with severe injuries. 
Besides, assessments by the medical team were rarely 
based on standardized criteria, and therefore not re- 
producible and usable in practice [13]. The same team 
published another study about the same markers (age, 
repeated attendance, and type of injury) to identify chil- 
dren victims of physical abuse or neglect among injured 
children attending Emergency departments [14]. They 
found no evidence that any of the markers were 
sufficiently accurate. Thus these two large studies only 
reviewed the accuracy of tests for two types of child 
abuse among children who attended Emergency departments
and already had injuries. A last study initially
aimed to evaluate the accuracy of tools for the early
identification of abused children, but only reported an
accuracy assessment of tools identifying high-risk parents
before the occurrence of child maltreatment [15].

The aim of our study was to review the evidence on 
the accuracy of instruments for identifying abused 
children during any stage of child maltreatment evo- 
lution before their death, and to assess if any might 
be adapted to screening, that is if accurate screening 
instruments were available. We define as instruments 
any reproducible assessment used in any types of 
setting. 

Methods 

Search strategy 

Information sources and search terms 

Electronic searches were carried out in the PUBMED database
from 1966 to April 2012, the PsycINFO database from
1970 to April 2012, the SCOPUS database from 1978 to April
2012, and the PASCAL and FRANCIS databases from 1961 to
April 2012, to identify articles published in French or
English. Search terms used were child abuse, child maltreatment,
battered child syndrome, child neglect, Munchausen
syndrome, shaken baby syndrome, child sexual
abuse, combined with sensitivity, specificity, diagnostic accuracy,
likelihood ratio, predictive value, false positive,
false negative, validity, test validation, and diagnosis,
measurement, psychodiagnosis, medical diagnosis, screening,
diagnostic imaging, physical examination, diagnostic
procedure, scoring system, diagnostic, score,
assessment (Table 1).
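
As a minimal sketch of how Boolean search strings such as those in Table 1 can be assembled, the snippet below joins the synonyms of each group with OR and the groups with AND. The function name `build_query` is hypothetical, and the term groups are copied from the SCOPUS row of Table 1; field tags and database-specific syntax are not modeled.

```python
def build_query(groups):
    """Join synonyms within a group with OR, then join the groups with AND."""
    clauses = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


# Term groups copied from the SCOPUS row of Table 1.
scopus_groups = [
    ["child abuse", "child maltreatment", "child neglect",
     "battered child syndrome", "munchausen syndrome", "shaken baby syndrome"],
    ["diagnosis", "measurement", "screening", "diagnostic imaging",
     "physical examination", "diagnostic procedure", "scoring system"],
    ["predictive value", "diagnostic accuracy", "likelihood ratio",
     "sensitivity", "specificity"],
]

query = build_query(scopus_groups)
print(query)
```

Keeping the term groups as data makes it straightforward to reuse the same concept blocks across databases while adapting only the syntax.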

Eligibility criteria 

To be included in this analysis, articles had to 1) state as 
an objective to estimate at least one accuracy parameter 
(sensitivity, specificity, predictive value or likelihood ra- 
tio) of a test identifying abused children (persons under 
age 18); 2) include a reference standard to determine 
whether a child had actually been abused; and 3) describe
the assessed test, e.g., when the authors presented

Table 1 Search terms used to identify potentially eligible articles

Database: PUBMED
Search terms:
("child abuse" [Mesh] OR "child maltreatment")
AND
("sensitivity and specificity" [Mesh] OR "sensitivity" OR
"specificity" OR "diagnostic accuracy" OR "likelihood ratio"
OR "predictive value" OR "false positive" OR "false
negative")

Database: PsycINFO
Search terms:
("battered child syndrome" OR "child abuse")
AND
("diagnosis" OR "measurement" OR "psychodiagnosis" OR
"medical diagnosis" OR "screening")

Database: SCOPUS
Search terms:
("child abuse" OR "child maltreatment" OR "child neglect"
OR "battered child syndrome" OR "munchausen
syndrome" OR "shaken baby syndrome")
AND
("diagnosis" OR "measurement" OR "screening" OR
"diagnostic imaging" OR "physical examination" OR
"diagnostic procedure" OR "scoring system")
AND
("predictive value" OR "diagnostic accuracy" OR
"likelihood ratio" OR "sensitivity" OR "specificity")

Database: FRANCIS/PASCAL
Search terms:
("child abuse" OR "child maltreatment" OR "child neglect"
OR "child sexual abuse" OR "battered child syndrome" OR
"munchausen syndrome" OR "shaken baby syndrome")
AND
("diagnosis" OR "measurement" OR "screening" OR
"physical examination" OR "diagnostic" OR "scoring
system" OR "score" OR "assessment")
AND
("test validation" OR "validity" OR "sensitivity" OR
"specificity" OR "predictive value" OR "diagnostic
accuracy" OR "likelihood ratio")



the information and method to carry out the assessment,
and not only the result of this assessment. As there is no 
gold standard for detecting child maltreatment, we de- 
fined acceptable reference standards as: expert assess- 
ments, such as child's court disposition; substantiation 
by the child protection services or other social services; 
diagnosis by a medical, social or judicial team using one 
or several information sources (caregivers or child inter- 
view, child symptoms, child physical examination, and 
other medical record review). The assessment made only 
by the caregiver was not accepted because 80% or more 
of maltreatment, other than sexual abuse, has been esti- 
mated to be perpetrated by parents or parental guardians 
[2]. Thus, the caregiver would likely not want to reveal
that their child is maltreated. Comparative studies of any
design examining the results of tools identifying abused 
children in two population groups (abused children and 
not abused children) were accepted (case-control, cohort,
and cross-sectional studies). Descriptive studies with only
one group of abused or not abused children, of which the
aim was to estimate one accuracy parameter, were also
accepted. To avoid missing any potentially relevant tool,
neither setting nor category of patients was used as an
inclusion or exclusion criterion.
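
The accuracy parameters named in the eligibility criteria can all be derived from a two-by-two table that crosses the index test result with the reference standard. The sketch below is illustrative only: the function names and the counts are hypothetical and are not taken from any reviewed study. A parameter is returned as `None` when its denominator is zero, as happens for specificity in a descriptive study that enrolled abused children only.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def accuracy_parameters(tp, fp, fn, tn):
    """Compute accuracy parameters from the four cells of a 2x2 table.

    tp: test positive, abused        fp: test positive, not abused
    fn: test negative, abused        tn: test negative, not abused
    """
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    lr_positive = lr_negative = None
    if sensitivity is not None and specificity is not None:
        lr_positive = _ratio(sensitivity, 1 - specificity)
        lr_negative = _ratio(1 - sensitivity, specificity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_predictive_value": _ratio(tp, tp + fp),
        "negative_predictive_value": _ratio(tn, tn + fn),
        "positive_likelihood_ratio": lr_positive,
        "negative_likelihood_ratio": lr_negative,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
params = accuracy_parameters(tp=20, fp=10, fn=5, tn=90)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```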

We did not consider tests to identify abusive caregivers, 
abused children after their death or children victims of 
intimate-partner violence. Articles were also excluded 
when they did not provide original data. Tests that identi- 
fied abused children after their death were excluded as 
they are by definition not relevant for early detection. 
Intimate-partner violence, regarded as a separate form of 
child maltreatment by several authors, was excluded because
the main victim is not the child [2].

Study selection 

Eligibility of studies was checked by a junior epidemiolo- 
gist and pediatrician (MB), from April, 2012 to May, 2012, 
and the resulting selection checked by a senior medical 
epidemiologist (LRS). Articles were first screened by titles. 
They were excluded when the title showed that the article 
did not address accuracy of tools identifying abused chil- 
dren. If the title did not clearly indicate the article's sub- 
ject, the summary was read. Abstracts were retained for 
full review when they met the inclusion criteria or when 
more information was required from the full text to ascer- 
tain eligibility. 

Data collection process, data items and analysis 

The first assessment of selected papers was done by MB, 
and results were discussed in regular meetings by both ep- 
idemiologists MB and LRS. To reduce the likelihood that 
potentially relevant articles were missed, reference lists 
from relevant articles were checked. From each included 
study, we abstracted information about study design, 
population characteristics, number of participants, screen- 
ing instrument or procedure, abuse or neglect outcome, 
and estimates of diagnostic accuracy. Results were not 
mathematically pooled due to varying methods and types 
of child abuse identified. 

Quality assessment 

The selected studies were assessed by MB and reviewed 
by LRS, using the QUADAS-1 criteria to assess quality 
of studies of diagnostic accuracy [16]. The standardized 
checklist included 15 criteria, grouped according to the 
domains defined by QUADAS-2 [17].
Two criteria related to patient selection: 

1) patients were representative of a spectrum of 
population including all stages of maltreatment 
before the death of the child; 

2) selection criteria were well described. 

Three criteria related to the index test:

3) the index test was described in sufficient details to 
permit replication; 

4) when the index test was a score, the cutoff was 
determined before results were available; 

5) the index test was interpreted without knowledge of 
the results of the reference standard. 

Three criteria related to the reference standard: 

6) the reference standard correctly classified patients; 

7) the reference standard was described in sufficient 
details to permit replication; 

8) the reference standard was interpreted without 
knowledge of the results of the index test. 

One criterion related to both the index test and refer- 
ence standard: 

9) the reference standard and the index test were 
independent. 

Five criteria related to flow and timing: 

10) the whole population or a random selection 
received the reference standard; 

11) the study population received the same reference 
standard; 

12) the time period between the reference standard 
and the index test was short enough so the 
situation of the child did not change; 

13) uninterpretable test results were reported; 

14) uninterpretable test results were well-balanced be- 
tween the reference standard and the index test. 

One criterion related to applicability: 

15) same clinical data available when test results were 
interpreted as would be available when the test is 
used in practice. 

Quality of studies was summarized by counting the 
number of criteria that were respected. Results of the final 
selection and analysis were reviewed by another senior
medical epidemiologist (VL) and a senior pediatrician (PP). 

Assessment of tools adaptation to screening 

Tools were considered adapted to screening, according 
to the WHO criteria on the adequacy of tests used in 
screening programs [18], if they fulfilled the following 
criteria: 1) identify abused children before they have serious 
consequences of child maltreatment; 2) identify abused 
children with a high sensitivity; 3) identify abused children
with a high enough specificity to avoid stigmatization of
caretakers who were not abusers.
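
The three criteria can be read as a simple conjunction, sketched below. The class and function names are hypothetical, and the 0.90 thresholds are an assumption made for illustration: the criteria above only require "high" sensitivity and "high enough" specificity without fixing a number.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    early_identification: bool  # identifies children before serious consequences
    sensitivity: float
    specificity: float


def adapted_to_screening(tool, min_sensitivity=0.90, min_specificity=0.90):
    """Return True only when all three screening criteria are fulfilled.

    The default thresholds are assumptions, not values stated in the criteria.
    """
    return (
        tool.early_identification
        and tool.sensitivity >= min_sensitivity
        and tool.specificity >= min_specificity
    )


# Hypothetical tools, for illustration only.
candidates = [
    Tool("tool A", early_identification=True, sensitivity=0.91, specificity=0.88),
    Tool("tool B", early_identification=False, sensitivity=0.97, specificity=0.95),
    Tool("tool C", early_identification=True, sensitivity=0.95, specificity=0.95),
]
for tool in candidates:
    print(tool.name, adapted_to_screening(tool))
```

Because the criteria are combined with a logical AND, a tool that is highly accurate but only detects children with established injuries still fails the check.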

Results 

Study selection 

Of 2 280 references identified in the databases, 524 were 
selected from their title, of which 137 abstracts were 
read; after exclusion of duplicates, 92 full articles were 
assessed (Figure 1). Studies excluded for lack of refer- 
ence standard were case-control studies with control 
groups recruited in the general population without verify- 
ing if children were abused or not. Studies were excluded 
when the reference standard was only the opinion of care- 
givers who had been asked whether their children were 
abused or not. One study was excluded because the
method of the index test, an assessment by primary care
clinicians, was not described [19]. Finally, one study was
excluded because an unknown number of children younger
than fifteen years examined in a medical center, who
should have been tested during the study period, did not
receive the index test and were not registered [20]. This
limitation was noticed because several abused children
identified by the reference standard, and who met the
inclusion criteria, had not received the index test from the
medical team and were not reported. Thirteen articles met the inclusion
criteria. The outcome of interest was sexual abuse in four 
studies [21-24], physical abuse in seven [25-31], psycho- 
logical abuse in one [32], and several forms of child mal- 
treatment (physical abuse, psychological abuse, sexual 
abuse, and physical neglect) in one [33]. Eight studies were 
prospective [21-26,32,33], and five were retrospective
assessments of diagnostic accuracy [27-31].

Quality of studies 

The maximum number of quality criteria met was eight
of fourteen, and five studies met four or fewer criteria
(Table 2). The accuracy of the reference standard was
never determined because no gold standard to identify
abused children is available. We could not judge the
representativeness of patients, because of insufficient information about
methods of patient recruitment [21,24,26,28,30-33], or because
many families refused to participate for undocumented reasons [22,23].
In three studies, details on the imaging technique or
assessment of impact trauma were not sufficiently described
to replicate the index test [25,27,28]. The reference
standard differed within the study population in the three
case-control studies [21,22,31]. In one study, the result of the index test was
used to establish the final diagnosis [23]. The time period
between the two tests was rarely reported; in one study, it
was on average 36.4 weeks, so that the situation regarding
child abuse could have changed [33]. We could not judge
whether the circumstances of test evaluation were the same as
in routine practice, because of a lack of information about the kind
of practice considered [22,25-29,31,33].
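
The quality summary is a plain count of criteria rated "Yes". The sketch below applies that count to two columns transcribed from Table 2 (Berenson et al [22] and Cheung et al [23]), assuming the column order of the table is as printed; the function name `criteria_met` is hypothetical.

```python
# Ratings for criteria 1 to 15, transcribed from Table 2.
RATINGS = {
    "Berenson et al, 2002 [22]": [
        "Unclear", "Yes", "Yes", "Yes", "Unclear",
        "Unclear", "No", "Unclear", "Yes", "Yes",
        "No", "Yes", "Yes", "Yes", "Unclear",
    ],
    "Cheung et al, 2004 [23]": [
        "Unclear", "No", "Yes", "NA", "Unclear",
        "Unclear", "Yes", "Unclear", "No", "Yes",
        "Yes", "Unclear", "No", "Unclear", "Yes",
    ],
}


def criteria_met(ratings):
    """Count criteria rated 'Yes'; 'No', 'Unclear' and 'NA' do not count."""
    return sum(1 for rating in ratings if rating == "Yes")


for study, ratings in RATINGS.items():
    print(f"{study}: {criteria_met(ratings)} criteria met")
```

Counting only "Yes" treats "Unclear" the same as "No", which is a conservative reading of incompletely reported studies.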

2280 articles identified in PUBMED, PsycINFO, SCOPUS, FRANCIS, PASCAL,
to April 2012, in English or French
  Excluded:
    1756 No evaluation of diagnostic accuracy

524 articles screened by titles
  Excluded:
    301 No evaluation of diagnostic accuracy
    57 Outcome of interest ≠ child maltreatment
    24 Test identifying victim in adulthood
    5 Test identifying abusive caregiver

137 articles screened by abstracts

92 articles after duplicates removed and screened by full-text
  Excluded:
    43 No evaluation of diagnostic accuracy
    2 Outcome of interest ≠ child maltreatment
    8 Test identifying victim in adulthood
    5 Test identifying abusive caregiver
    15 No reference standard or no acceptable reference standard
    1 Test not enough described to be reproducible
    2 Unable to verify the estimations because of inconsistent or missing data
    2 Study only available in abstract form
    1 Study protocol not respected

13 studies included

Figure 1 Diagram illustrating the study selection process, April 2012.
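
The counts in Figure 1 can be checked for internal consistency with a few subtractions. In the sketch below, the variable names are hypothetical and the assignment of each exclusion list to a stage is our reading of the diagram; the numbers themselves are those printed in the figure.

```python
identified = 2280

# Exclusion counts as printed in Figure 1, grouped by the stage they follow.
first_exclusions = [1756]
second_exclusions = [301, 57, 24, 5]
full_text_exclusions = [43, 2, 8, 5, 15, 1, 2, 2, 1]

screened_by_title = identified - sum(first_exclusions)
screened_by_abstract = screened_by_title - sum(second_exclusions)
assessed_full_text = 92  # after removal of duplicates, as stated in the figure
included = assessed_full_text - sum(full_text_exclusions)

print("screened by titles:", screened_by_title)
print("screened by abstracts:", screened_by_abstract)
print("removed before full-text assessment:", screened_by_abstract - assessed_full_text)
print("included:", included)
```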



Diagnostic accuracy 
Identification of physical abuse 

Four studies were about children with inflicted head in- 
jury (Table 3) [25-28]. One test identified abused chil- 
dren among those admitted to a tertiary care pediatric 
hospital for acute traumatic intracranial injury, when 
caregivers reported no history of trauma or a history of 
low-impact trauma, i.e. with a fall from < 3 feet or with 
other low-impact non-fall mechanisms [27]. The other 
tests identified abused children by using findings of physical
examination or computed tomography among children
hospitalized in Pediatric Intensive Care Units [25,26],
Neurosurgical [25,26] or Emergency departments [25,26]
or a regional pediatric medical center [28] for head trauma.
A prediction rule combining four variables (hygroma;
convexity subdural hematoma without hygroma; no fracture;
and interhemispheric subdural hematoma on computed
tomography images at clinical presentation) could identify
84% of abused children [28].

Three studies estimated accuracy of tests identifying 
physical abuse and were not limited to intentional head 
trauma [29-31]. A decision tool based on three questions 
(age of child; localization of bruise during the initial 72 
hours of patient's admission; and confirmation of acci- 
dent in public setting) identified abused children among 
children aged 0 to 4 y admitted to a Pediatric Intensive- 
Care Unit, with a sensitivity of 97% (95% CI: 84-100) 
[31]. In another study, presence of bruises in the same
body site as a fracture identified 26% of abused children
among children with acute fractures referred for
possible child abuse to a specialized team [30]. Finally, a
score was developed to identify physically abused children
14 years old or younger, with at least one diagnosis of
injury as defined by the International Classification of
Diseases, 9th revision (ICD-9; codes 800 to 959), in 1961
hospitals in 17 states of the United States. The 26-point
score based on presence of fracture of base or vault of
skull (1 point), eye contusion (3 points), rib fracture (3
points), intracranial bleeding (4 points), multiple burns
(3 points), and age of the child (3 points for age group
1-3 y, 12 points for age group 0-1 y) identified 87% of
physically abused children when the score was > 3 [29].
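
The point values described for this score can be written as a small lookup, sketched below. The identifiers are hypothetical, and the handling of the age boundaries (under 1 year for the 12-point group, 1 to 3 years inclusive for the 3-point group) is an assumption, because the text gives overlapping ranges. This is an illustration of the arithmetic, not a validated implementation of the published index.

```python
# Points per finding, as listed in the text above.
POINTS = {
    "skull_base_or_vault_fracture": 1,
    "eye_contusion": 3,
    "rib_fracture": 3,
    "intracranial_bleeding": 4,
    "multiple_burns": 3,
}


def age_points(age_years):
    """Age contribution; the boundary handling is an assumption."""
    if age_years < 1:
        return 12
    if age_years <= 3:
        return 3
    return 0


def injury_score(findings, age_years):
    """Sum the points of the distinct findings plus the age contribution."""
    return sum(POINTS[f] for f in set(findings)) + age_points(age_years)


def screen_positive(score, cutoff=3):
    """Positive when the score is strictly greater than the cutoff."""
    return score > cutoff


# Hypothetical child: 2 years old with a rib fracture.
score = injury_score(["rib_fracture"], age_years=2)
print(score, screen_positive(score))
```

With these weights, every child under one year of age scores above the cutoff on age alone, which helps explain how the score trades specificity for sensitivity in infants.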

Identification of sexual abuse 

The sensitivity of tests using the results of anal
and genital examination of the child was estimated at best at 56%
(95% CI: 33-77), and the specificity at 98% (95% CI: 91-100)
[22,23] (Table 4). The frequency of a variety of sexual
behaviors of the child over the six months
prior to assessment was not associated with sexual abuse
[24]. A list of 12 symptoms expressed by the child, such
as difficulty getting to sleep, change to poor school
performance, or unusual interest in sexual matters,
identified sexually abused children when caretakers reported
at least three symptoms, with a sensitivity of 91% and a
specificity of 88% [21]. The settings in which the studies
took place were consultations with a team specialized in
child abuse or, when a control group was chosen,
consultations at pediatric clinics for well-child examination
or other complaints.
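
How a sensitivity of 91% and a specificity of 88% would translate into predictive values depends on how common abuse is in the tested population. The sketch below applies Bayes' rule; the function name and the prevalence values are hypothetical and chosen only for illustration, since the studies above were conducted in referred or selected populations.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of abuse given a positive test, by Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Sensitivity and specificity reported for the 12-symptom list [21];
# the prevalence values are assumptions.
for prevalence in (0.01, 0.05, 0.20):
    ppv = positive_predictive_value(0.91, 0.88, prevalence)
    print(f"prevalence {prevalence:.0%}: positive predictive value {ppv:.1%}")
```

The lower the prevalence, the larger the share of positive results that are false positives, which is why specificity matters for avoiding stigmatization of non-abusive caretakers.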



Table 2 Quality of studies of the diagnostic accuracy of tests identifying child neglect or abuse

Studies (columns, identified by reference number): [22] Berenson et al, 2002; [33] Bernstein et al, 1997; [29] Chang et al, 2005; [23] Cheung et al, 2004; [24] Drach et al, 2001; [32] Fernandopulle et al, 2003; [27] Hettler et al, 2003; [31] Pierce et al, 2010; [30] Valvano et al, 2009; [25] Vinchon et al, 2010; [26] Vinchon et al, 2005; [28] Wells et al, 2002; [21] Wells et al, 1997.
Ratings: Y Yes; N No; U Unclear; NA* Not Applicable.

Criteria of quality                                 [22] [33] [29] [23] [24] [32] [27] [31] [30] [25] [26] [28] [21]
1. Representative spectrum of patients              U    U    Y    U    U    U    Y    U    U    N    U    U    U
2. Description of selection criteria                Y    N    Y    N    N    N    Y    N    N    Y    N    N    N
3. Replication of the index test                    Y    Y    U    Y    Y    Y    N    Y    Y    N    U    N    Y
4. Cutoff determined before results were available  Y    N    N    NA   Y    N    NA   N    NA   NA   NA   N    N
5. Interpretation without knowledge of the          U    Y    U    U    U    Y    U    N    U    U    U    Y    U
   results of reference standard
6. Classification by reference standard             U    U    U    U    U    U    U    U    U    U    U    U    U
7. Replication of the reference standard            N    N    N    Y    N    N    N    N    N    N    N    N    N
8. Interpretation without knowledge of the          U    Y    U    U    Y    Y    U    Y    Y    U    U    Y    U
   results of index test
9. Independence of reference and index tests        Y    U    U    N    Y    Y    Y    U    Y    U    U    U    U
10. Systematic reference standard                   Y    Y    Y    Y    Y    Y    Y    Y    Y    Y    Y    Y    Y
11. Same reference standard                         N    Y    Y    Y    Y    Y    Y    N    Y    Y    Y    Y    N
12. Short enough time period between reference      Y    N    Y    U    U    U    Y    U    U    U    U    U    U
    and index tests
13. Uninterpretable results reported                Y    N    N    N    N    N    U    N    N    N    N    N    N
14. Uninterpretable results balanced                Y    U    U    U    U    U    U    U    U    U    U    U    U
15. Same clinical data available as in routine      U    U    U    Y    Y    Y    U    U    U    U    U    U    N

*NA Not Applicable.

Table 3 Description of selected studies estimating diagnostic accuracy of tests identifying physically abused children

Source: Vinchon et al, 2010 [25]
Inclusion criteria: Children <2 y referred alive to Emergency, PICU* or ND† for HT‡ with cerebral scan
Form of child abuse: Inflicted head injury
Sample size: 84
Reference standard: Assessment by forensic neurosurgeon, pediatrician, psychologist, social worker
Index test: sensitivity % (95% CI); specificity % (95% CI)
  Severe RH§: 57; 97
  Brain ischemia: 27; 97
  SDH‖: 27; 97
  No scalp swelling: 98; 77

Source: Vinchon et al, 2005 [26]
Inclusion criteria: Children <2 y referred alive to Emergency, PICU* or ND† for HT‡ with cerebral scan
Form of child abuse: Inflicted head injury
Sample size: 207
Reference standard: Assessment by forensic neurosurgeon, pediatrician, psychologist, ophthalmologist, social worker
Index test: sensitivity % (95% CI); specificity % (95% CI)
  RH§ Grade 1, 2 or 3: 75 (62-86); 93 (85-78)
  RH§ Grade 2 or 3: 66 (52-78); 100 (95-100)

Source: Hettler et al, 2003 [27]
Inclusion criteria: Children <3 y hospitalized for HT‡ with intracranial hemorrhage
Form of child abuse: Inflicted head injury
Sample size: 163
Reference standard: Assessment by medical team integrating witnessed or confessed abuse, predefined specific findings during physical child examination
Index test: sensitivity % (95% CI); specificity % (95% CI)
  No history of trauma or low-impact trauma: 69 (55-82); 97 (83-100)

Source: Wells et al, 2002 [28]
Inclusion criteria: Children <3 y hospitalized for HT‡ with intracranial hemorrhage
Form of child abuse: Inflicted head injury
Sample size: 257
Reference standard: Assessment by medical team, integrating history, age and sex of child, results of official investigation, medical records excluding CT¶
Index test: sensitivity % (95% CI); specificity % (95% CI)
  Score integrating CT¶ imaging patterns: 84 (78-90); 83 (74-90)

Source: Pierce et al, 2010 [31]
Inclusion criteria: Newborn to 4 y hospitalized in PICU* for trauma
Form of child abuse: Physical abuse
Sample size: 95
Reference standard: Assessment by medical, juridical team, and CPS**
Index test: sensitivity % (95% CI); specificity % (95% CI)
  Decision tool integrating bruise region, age of child, trauma history: 97 (84-100); 84 (69-94)

Source: Valvano et al, 2009 [30]
Inclusion criteria: Children <18 y referred to specialized team with fracture, excluded head
Form of child abuse: Physical abuse
Sample size: 150
Reference standard: Expert assessment integrating history, type of injuries and familial characteristics
Index test: sensitivity % (95% CI); specificity % (95% CI)
  Bruise in the same body sites†† as fracture: 26 (17-36); 75 (62-85)

Source: Chang et al, 2005 [29]
Inclusion criteria: Children <14 y with at least one trauma diagnosis with ICD-9‡‡
Form of child abuse: Physical abuse
Sample size: 58 558
Reference standard: E codes and certain ICD-9 codes‡‡
Index test: sensitivity % (95% CI); specificity % (95% CI)
  SIPCA§§, score integrating age of child, physical examination and results of imaging: 87 (84-90); 81 (81-81)

* PICU Pediatric Intensive Care Unit.
† ND Neurosurgical Department.
‡ HT Head Trauma.
§ RH Retinal Hemorrhage.
‖ SDH Subdural Hematoma.
¶ CT Computed Tomographic.
** CPS Child Protection Service.
†† Seven body sites: four extremities, torso, pelvis and head/neck.
‡‡ ICD International Classification of Diseases, Ninth Revision.
§§ SIPCA Screening Index for Physical Child Abuse.



Identification of psychological abuse

In a self-administered questionnaire, children were ex- 
pected to indicate how often they experienced a given 
parental/caregiver behavior (Table 4). The scale was ad- 
ministered to children aged 13-15 years without spe- 
cific complaints attending a school within the city of 
Colombo. At a cutoff of 95 and greater, 20 of 26 abused 
children were identified [32]. 
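
The count of 20 identified children out of 26 corresponds to a sensitivity of about 77%, and a confidence interval can be attached to it. The sketch below computes an exact binomial (Clopper-Pearson) interval by bisection using only the standard library. The function names are hypothetical, and the choice of the exact method is an assumption: the review does not state how its intervals were computed, so the result can only be compared with, not equated to, the interval reported in Table 4.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, iterations=60):
    """Find p in [0, 1] with g(p) = 0 by bisection, for g increasing in p."""
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if g(mid) < 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a proportion x / n."""
    if x == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= x | p) equals alpha / 2.
        lower = _solve_increasing(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    if x == n:
        upper = 1.0
    else:
        # Upper bound: P(X <= x | p) equals alpha / 2.
        upper = _solve_increasing(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


identified, abused = 20, 26  # counts reported in the text above
sensitivity = identified / abused
lower, upper = clopper_pearson(identified, abused)
print(f"sensitivity {sensitivity:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With only 26 abused children, the interval is wide, which illustrates how imprecise accuracy estimates are in small samples.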

Identification of several forms of child maltreatment 

The Childhood Trauma Questionnaire is a 70-item screen- 
ing inventory that assesses self-reported experiences of 
abuse and neglect in childhood and adolescence (Table 4). 



Accuracy was estimated for each form of child maltreat- 
ment in an adolescent psychiatric population. Physical neg- 
lect was defined as the failure of caretakers to provide for a 
child's basic physical needs like food or clothing. The
estimated sensitivity and specificity were the best for sexual
abuse. The sensitivity was estimated at 86% (95% CI: 71-94),
and the specificity at 76% (95% CI: 67-83) [33].

Adaptation to screening 

Identified tools were not adapted to screening because 
of low sensitivity and late identification of abused children,
when they already have serious consequences of
maltreatment. 
Table 4 Description of selected studies estimating diagnostic accuracy of tests identifying abused children, excluding physical abuse

Source: Cheung et al, 2004 [23]
Inclusion criteria: Children <18 y, referred to specialized team*
Form of child abuse: Sexual abuse
Sample size: 77
Index test: Classification of anal and genital examination findings
Reference standard: Assessment by medical team integrating medical history, children behavior, laboratory results, anogenital findings
Sensitivity % (95% CI): 56 (33-77)
Specificity % (95% CI): 98 (91-100)

Source: Berenson et al, 2002 [22]
Inclusion criteria: Girls 3-8 y referred to specialized team* or consulting at the pediatric clinics
Form of child abuse: Sexual abuse with penetration
Index test: Horizontal diameter of the hymen > or < 6.5 mm in knee-chest position
Reference standard: Assessment by nurse, psychologist or social worker integrating children interview, CSBI† and assessment by CPS‡. Assessment by nurse integrating D/P vulvar Penetration Rating Scale§
Sensitivity % (95% CI): 29 (22-36)
Specificity % (95% CI): 86 (81-91)

Source: Drach et al, 2001 [24]
Inclusion criteria: Children 2-12 y referred to SCAP team‖
Form of child abuse: Sexual abuse
Sample size: 209
Index test: CSBI†, parental interview about child sexual behavior
Reference standard: Expert assessment integrating child interview, history and physical examination
Sensitivity % (95% CI): 50 (37-63)
Specificity % (95% CI): 50 (42-58)

Source: Wells et al, 1997 [21]
Inclusion criteria: Boys <18 y referred to CPS‡ or consulting for well-child examination
Form of child abuse: Sexual abuse
Sample size: 74
Index test: SASA¶, parental interview integrating 12 child symptoms
Reference standard: Assessment by CPS‡ or by a series of screening techniques
Sensitivity % (95% CI): 91 (71-99)
Specificity % (95% CI): 88 (77-96)

Source: Fernandopulle et al, 2003 [32]
Inclusion criteria: Children 13-15 y in school
Form of child abuse: Emotional abuse
Index test: Self-report questionnaire directed to children
Reference standard: Psychiatrist's assessment during child interview
Sensitivity % (95% CI): 77 (56-91)
Specificity % (95% CI): 51 (39-63)

Source: Bernstein et al, 1997 [33]
Inclusion criteria: Children 12-17 y hospitalized in psychiatry
Index test: CTQ**, self-report questionnaire directed to children
Reference standard: Assessment by therapists integrating structured child interview, follow-up information and assessment of CPS‡
Form of child abuse: sensitivity % (95% CI); specificity % (95% CI)
  Physical abuse: 82 (70-90); 73 (63-81)
  Emotional abuse: 79 (66-88); 72 (62-80)
  Sexual abuse: 86 (71-94); 76 (67-83)
  Physical neglect: 78 (62-89); 61 (53-70)

* Team evaluating children during reporting to Child Protection Services.
† CSBI Child Sexual Behavior Inventory.
‡ CPS Child Protection Services.
§ Score evaluating the probability of sexual penetration.
‖ Spurwink Child Abuse Program for identifying abused children in Oregon.
¶ SASA Signs Associated with Sexual Abuse.
** CTQ Childhood Trauma Questionnaire.



Discussion 

Assessing the accuracy of instruments is difficult,
because there is no gold standard for identifying abused
children. To approximate one, the opinions of experts or of
medical, social or judicial teams are usually used as the
reference standard [21,24-28,30-33], but the accuracy of
these assessments is not known. Furthermore, the information
used for these assessments was rarely specified, so it was
difficult to verify that the index test and the reference
standard were independent. Incorporating index test results
into the reference standard would lead to overestimating the
accuracy of the test [21,25,26,28,29,31,33]. Chang et al used
the International Classification of Diseases (ICD), 9th
Revision, and E-codes (external cause codes), which categorize
the intent and mechanism of an injury, as the reference
standard [29]. In a recent study at the Yale-New Haven
Children's Hospital from 2007 to 2010, the specificity of
coding injuries as physical abuse was 100% (95% CI: 96-100).
But the sensitivity was low: among the 43 cases determined to
be abuse by the Child Abuse Pediatrician, four were miscoded
as accidents, two as injuries of undetermined cause, and four
did not receive any injury code [34]. In 1991-1992 in
California, the sensitivity of hospital E-coded data for
identifying child victims of intentional injuries was
estimated at 75% (95% CI: 64-84) [35]. This classification
underestimates the number of abused children and therefore
does not seem to be a good reference test: some cases of
child physical abuse are coded as accidents, and the cases
coded as physical abuse are not representative of all cases
of physical abuse, because some cases did not receive any
injury code.
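
As a rough worked illustration of the Yale-New Haven figures above, the following sketch computes a sensitivity and an approximate 95% confidence interval from raw counts. It assumes that the 33 cases not listed as miscoded or uncoded were coded as physical abuse, which the text does not state explicitly. The function names and the choice of the Wilson score interval are illustrative assumptions and need not match the method used in the cited studies.

```python
import math


def proportion(successes: int, total: int) -> float:
    """Return successes / total as a float."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successes / total


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    p = proportion(successes, total)
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the text: 43 abused children, of whom 4 were coded as
# accidents, 2 as undetermined cause, and 4 received no injury code.
abused_total = 43
missed = 4 + 2 + 4
# Assumption: every remaining case was coded as physical abuse.
detected = abused_total - missed

sensitivity = proportion(detected, abused_total)
low, high = wilson_interval(detected, abused_total)
print(f"sensitivity = {sensitivity:.2f}, 95% CI about {low:.2f}-{high:.2f}")
```

Under this assumption the point estimate is 33/43, about 77%, which is close to the 75% reported for the Californian E-coded data.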

In this systematic review, the quality of the selected
studies was low, even without considering the criterion
related to the reference standard. The available information
was often insufficient to judge many of the criteria. Some of
the limitations, for instance the use of the index test to
establish the final diagnosis, are particularly worrisome, as
they reflect an important misconception of what good
diagnostic research is. This overall poor quality likely
limits the validity of the selection of studies, as many
could have been excluded on the basis of quality alone.
Clearly, the quality of reporting of studies of diagnostic
accuracy on child maltreatment needs to improve. Furthermore,
in five studies, the retrospective evaluation based on a
review of records could have introduced bias [27-31]. And in
the three case-control studies, the performance of the index
test could have been overestimated, because excluding
children for whom maltreatment is difficult to diagnose
increases the differences between the two groups [21,22,31].

We were interested in tools identifying abused children
as early as possible in the course of child maltreatment.
Existing instruments reported to diagnose child maltreatment
were not designed for screening. Many tools identify abused
children only when they already have clinical consequences of
child maltreatment, such as head injury, fracture, or
behavior problems [21,24-31]. Identifying abused children at
the clinical stage comes too late. The performance of the
tests was also not suited to screening. Screening
instruments require high sensitivity, so that very few abused
children are missed. In our synthesis, most sensitivity
estimates were low [22-27,30,32,33]. The specificity of tests
is also important because of the negative effects of
misidentification, in particular the psychological impact and
the potential stigmatization of the child and his parents
[36]. As usual, when the sensitivity of a test was high, its
specificity was often low [25]. The sensitivity was greater
than 90% and the specificity greater than 80% for only two
tests [21,31]. However, one was a decision tool to identify
physically abused children among those hospitalized in a
Pediatric Intensive Care Unit, so the children already had
severe injuries [31]. The other test was based on twelve
child symptoms to identify sexually abused children [21].
These symptoms could reflect severe psychological
consequences such as depression: sudden emotional and
behavior changes, a decline in school performance, frequent
stomachaches, difficulty getting to sleep or sleeping more
than usual.
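
To make the trade-off between missed children and misidentified families concrete, the following sketch converts a sensitivity and a specificity into expected counts in a screened population. The two accuracy pairs are the point estimates reported in Table 4 for the hymenal diameter measurement [22] and the SASA interview [21]. The population size of 10 000 and the prevalence of 2% are purely hypothetical values chosen for illustration, the function name is an assumption, and applying estimates obtained in referred children to a general population is itself a simplification.

```python
def expected_outcomes(n_screened: int, prevalence: float,
                      sensitivity: float, specificity: float) -> dict:
    """Expected 2x2 counts when a test is applied to n_screened children."""
    abused = n_screened * prevalence
    not_abused = n_screened - abused
    return {
        "true_positives": abused * sensitivity,
        "false_negatives": abused * (1 - sensitivity),
        "false_positives": not_abused * (1 - specificity),
        "true_negatives": not_abused * specificity,
    }


# Hypothetical screening scenario (not taken from the review).
N_SCREENED = 10_000
PREVALENCE = 0.02

# Point estimates of (sensitivity, specificity) as reported in Table 4.
tests = {
    "hymenal diameter [22]": (0.29, 0.86),
    "SASA interview [21]": (0.91, 0.88),
}

for name, (sens, spec) in tests.items():
    counts = expected_outcomes(N_SCREENED, PREVALENCE, sens, spec)
    print(f"{name}: missed = {counts['false_negatives']:.0f}, "
          f"wrongly flagged = {counts['false_positives']:.0f}")
```

With these hypothetical inputs, even the more accurate of the two tests would wrongly flag about 1 176 children (9 800 x 0.12) while correctly identifying 182 (200 x 0.91), which illustrates why specificity matters as much as sensitivity when prevalence is low.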

Child maltreatment is the "disease" of both the child and
his caregiver. An abusive caregiver is defined by his
abusive behavior, and child maltreatment begins with this
behavior, which is responsible for the poor health and
development of the child. Thus, identification of child
maltreatment could involve identifying both the abused child
and his abusive caregiver. Two self-report questionnaires
were directed to children, who had to indicate whether they
had experienced given behaviors of parents or caregivers
[32,33]. As only children old enough to read could answer,
these questionnaires cannot help reduce deaths in the most
vulnerable groups. Indeed, fatal child maltreatment occurs
most frequently in younger children [2,37-39]. Over half of
the 600 victims of child maltreatment under five years of age
reported to the National Violent Death Reporting System of
the United States of America from 2003 to 2006 were under one
year old [40].

The WHO definition of child maltreatment is problematic,
as maltreatment is defined by the consequences of neglectful
or abusive behaviors that are themselves not defined [1,3].
Similarly, Article 19 of the United Nations Convention on the
Rights of the Child, which refers to "all forms of physical
or mental violence, injury and abuse, neglect or negligent
treatment, maltreatment or exploitation, including sexual
abuse", does not define these behaviors. Moreover, proposed
definitions based only on abusive behaviors can vary widely.
For example, some authors, but not others, require physical
contact or penetration before defining reported experiences
as sexual abuse [41-44]. Instruments designed to diagnose
abusive caregivers, such as the Child Abuse Potential
Inventory [45] and the International Society for the
Prevention of Child Abuse and Neglect (ISPCAN) Child Abuse
Screening Tool-Parent [46], measure these potentially abusive
behaviors of the caregiver. Consequently, what they measure
is neither well known nor well defined. Furthermore, they can
identify only child maltreatment that is directly due to the
questioned parent. These problems might explain why child
maltreatment is usually recognized only when the child
suffers the consequences of abusive behaviors.

Because little is known about the evolution of child
maltreatment, studying the accuracy of diagnostic
instruments that identify abused children early remains
challenging. Research is required to define what subclinical
and clinical abusive behaviors are and when child
maltreatment begins. A multidisciplinary approach might be
necessary to correctly identify child maltreatment because
it has multiple targets, the child and the caregiver. Input
from adult psychiatry is necessary to assess the potentially
abusive behaviors of caregivers. One might reasonably
hypothesize that tools based on the simultaneous assessment
of potentially abusive behaviors and of the health and
development of the child could allow earlier identification
of an abused child or abusive caregiver than tools based only
on separate assessments of the child or the caregiver.
However, if a combined approach is likely to be more
sensitive, it might also be less specific. Furthermore,
because of the several types of child maltreatment and the
varied consequences for children, several tests might be
necessary to screen for all types of child maltreatment. The
final value of the features used for screening will also
depend on the prevalence of these features.
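
The last two points of this paragraph can be illustrated numerically. The sketch below first combines two tests in parallel (a child is flagged if either test is positive), under the strong and unverified assumption that the two tests are conditionally independent given the true status, and then shows how the positive predictive value (PPV) changes with prevalence. The input accuracies and prevalences are hypothetical and the function names are assumptions. "Prevalence" is read here as the prevalence of maltreatment in the screened population, which is only one possible interpretation.

```python
def combine_parallel(sens_a: float, spec_a: float,
                     sens_b: float, spec_b: float) -> tuple[float, float]:
    """Accuracy of 'positive if either test is positive'.

    Assumes the two tests are conditionally independent given true status.
    """
    sensitivity = 1 - (1 - sens_a) * (1 - sens_b)  # missed only if both miss
    specificity = spec_a * spec_b                  # negative only if both are
    return sensitivity, specificity


def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Bayes' rule: P(abused | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical accuracies for a child-based and a caregiver-based assessment.
child_test = (0.70, 0.90)
caregiver_test = (0.60, 0.85)

sens, spec = combine_parallel(*child_test, *caregiver_test)
print(f"combined sensitivity = {sens:.3f}, combined specificity = {spec:.3f}")

# Hypothetical prevalences of maltreatment in the screened population.
for prevalence in (0.01, 0.05, 0.20):
    ppv = positive_predictive_value(sens, spec, prevalence)
    print(f"prevalence = {prevalence:.2f}: PPV = {ppv:.3f}")
```

With these inputs the combined rule reaches a sensitivity of 0.88, higher than either test alone, at the cost of a specificity of 0.765, lower than either test alone, and the PPV rises as prevalence rises.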

We reviewed only studies in French and English and only
published studies indexed in databases, and might have
excluded relevant research. Also, one of our inclusion
criteria was that the aim of the study was clearly to
estimate the diagnostic accuracy of a test identifying abused
children. This might have excluded some studies from which
some parameters of diagnostic accuracy could have been
estimated. Finally, we were interested in all forms of child
maltreatment and all types of tools, and we did not specify a
particular setting, such as emergency departments. Depending
on the context, some tools could not be applied: for example,
a test requiring a specific laboratory result if the
laboratory exam cannot be performed routinely. Besides, we
reviewed the evidence on the accuracy of instruments for
identifying abused children at any stage of the evolution of
child maltreatment before their death. Thus, both diagnostic
and screening studies could be included in our review. We
evaluated whether accurate screening instruments were
available among the selected studies. However, it is not
enough for a screening test to be sensitive and specific. The
side effects, the reliability and the cost of the test
should also be considered. Indeed, before considering a
screening program for child maltreatment, several other
criteria need to be met [18]. A screening program should also
be acceptable to families and professionals. Negative effects
for the family are consequences of false negatives (children
wrongly identified as not abused) and of false positives
(children wrongly identified as abused and parents wrongly
identified as abusers). The stigmatization of families is an
important ethical issue. Furthermore, confirming the
relevance of screening for child maltreatment is not enough,
as the modalities of the program should also be specified,
including the site, the relevant target population if
screening is not mass screening, the age of the child at the
time of screening, and the frequency if screening is
repeated. Lastly, a screening program could become
unnecessary if primary prevention programs for child abuse
were effective. Several primary prevention programs, such as
the Nurse Family Partnership [47] and Early Start [48], have
been proposed, but the evidence is currently insufficient to
assess the balance between the benefits and harms of primary
care interventions [49].

Conclusions 

Evidence on the accuracy of instruments for identifying
abused children is very scarce and of low quality. Child
maltreatment is mostly identified when children already have
serious consequences, and the sensitivities and specificities
of the tools are inadequate. Before a screening program for
child maltreatment can be considered, better knowledge of how
child maltreatment begins and the development of valid
screening instruments for subclinical stages remain
necessary.